[Runtime][Vulkan] Add RGP support to TVM for vulkan device #10953

Merged: 1 commit, Apr 13, 2022

avulisha (Contributor) commented

RGP (Radeon GPU Profiler) is a tool for analyzing applications that
run on AMD GPUs. RGP captures data based on Vulkan present calls and
provides hardware-specific information, allowing the developer to
optimize the application. To add RGP support to TVM, the debug labels
"AmdFrameBegin" and "AmdFrameEnd" need to be inserted into the Vulkan
queue. These labels help the RGP tool identify the start and end of a
frame when no present is available, which enables it to capture and
analyze the data.

At runtime, set the environment variable "TVM_USE_AMD_RGP=1" to start
inserting the debug labels into the Vulkan queue.

Signed-off-by: Wilkin Chau <Wing-Ki.ChauWilkin@amd.com>
Signed-off-by: Anurag Kumar Vulisha <AnuragKumar.Vulisha@amd.com>
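
The frame bracketing described above can be illustrated with a minimal sketch. It is not the code from this PR: `RgpFrameScope` and `InsertLabelFn` are hypothetical names, and the callback is a stand-in for whatever VK_EXT_debug_utils queue-label call the real implementation issues (for example `vkQueueInsertDebugUtilsLabelEXT`, which is an assumption here). Only the label strings "AmdFrameBegin" and "AmdFrameEnd" come from the description.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a Vulkan queue label call. In a real build this
// callback would wrap a VK_EXT_debug_utils entry point; here it only receives
// the label name so the sketch compiles without the Vulkan SDK.
using InsertLabelFn = std::function<void(const std::string&)>;

// RAII guard (illustrative, not the PR's class): emits "AmdFrameBegin" on
// construction and "AmdFrameEnd" on destruction, so one unit of GPU work is
// bracketed as a frame even though nothing is ever presented.
class RgpFrameScope {
 public:
  explicit RgpFrameScope(InsertLabelFn insert_label)
      : insert_label_(std::move(insert_label)) {
    if (insert_label_) insert_label_("AmdFrameBegin");
  }

  ~RgpFrameScope() {
    if (insert_label_) insert_label_("AmdFrameEnd");
  }

  // Non-copyable so each frame gets exactly one begin and one end label.
  RgpFrameScope(const RgpFrameScope&) = delete;
  RgpFrameScope& operator=(const RgpFrameScope&) = delete;

 private:
  InsertLabelFn insert_label_;
};
```

A caller would create one `RgpFrameScope` around each kernel launch or queue submission. Passing an empty callback makes the guard a no-op, which is one way to keep the labels off when the environment variable is not set.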

Thanks for contributing to TVM! Please refer to guideline https://tvm.apache.org/docs/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

masahi (Member) left a comment

Thanks @avulisha! I'll try to get this working on my RX 6600 XT.

@@ -59,6 +59,14 @@ VulkanInstance::VulkanInstance() {
std::vector<const char*> required_extensions{};
std::vector<const char*> optional_extensions{"VK_KHR_get_physical_device_properties2"};

// Check if RGP support is needed. If needed, enable VK_EXT_debug_utils extension for
// inserting debug labels into the queue.
const char* val = getenv("TVM_USE_AMD_RGP");
masahi (Member) commented

Use BoolEnvironmentVar instead of a raw getenv call, as in the existing validation-layer check:

if (support::BoolEnvironmentVar("TVM_VULKAN_ENABLE_VALIDATION_LAYERS")) {
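
For readers unfamiliar with this kind of helper, here is a minimal sketch of a boolean environment-variable check. `EnvFlagEnabled` is a hypothetical name and the parsing rule (unset, empty, or "0" means false) is an assumption for illustration. It is not TVM's `support::BoolEnvironmentVar`, whose exact behavior is not shown in this thread.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, not TVM's implementation: an unset, empty, or "0"
// variable counts as false, and any other value counts as true.
inline bool EnvFlagEnabled(const char* name) {
  const char* value = std::getenv(name);
  if (value == nullptr) return false;  // variable is not set
  const std::string text(value);
  return !text.empty() && text != "0";
}
```

Centralizing the parsing in one function means every flag is interpreted the same way, so `TVM_USE_AMD_RGP=0` cannot accidentally enable the feature the way a bare null check on `getenv` would.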

avulisha (Contributor Author) commented

Hi @masahi,
Thanks for your time in reviewing the changes. I will implement your suggestion.
Thanks,
Anurag

@@ -55,11 +55,15 @@ VulkanStream::VulkanStream(const VulkanDevice* device)
cb_begin.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
cb_begin.pInheritanceInfo = 0;
VULKAN_CALL(vkBeginCommandBuffer(state_->cmd_buffer_, &cb_begin));

profiler_ = new AmdRgpProfiler(device_);
masahi (Member) commented

Please do this only when RGP is enabled.
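
One way to follow this suggestion is to hold the profiler in a smart pointer that stays null unless the flag is on. The sketch below uses hypothetical stand-ins (`ProfilerSketch`, `StreamSketch`) rather than the PR's `AmdRgpProfiler` and `VulkanStream`, and it does not claim to match the change that was eventually pushed.

```cpp
#include <memory>

// Hypothetical stand-in for the PR's profiler class. The real one takes a
// device pointer and talks to Vulkan; this one only counts marked frames.
struct ProfilerSketch {
  int frames_marked = 0;
  void MarkFrame() { ++frames_marked; }
};

// Hypothetical stand-in for the stream: the profiler is allocated only when
// RGP support is enabled, so the default path pays no allocation cost.
class StreamSketch {
 public:
  explicit StreamSketch(bool rgp_enabled) {
    if (rgp_enabled) profiler_ = std::make_unique<ProfilerSketch>();
  }

  // Every use of the profiler is guarded by a null check.
  void Launch() {
    if (profiler_) profiler_->MarkFrame();
  }

  bool has_profiler() const { return profiler_ != nullptr; }
  int frames_marked() const { return profiler_ ? profiler_->frames_marked : 0; }

 private:
  std::unique_ptr<ProfilerSketch> profiler_;  // null when RGP is disabled
};
```

Using `std::unique_ptr` instead of a raw `new` also removes the need for a matching `delete` in the destructor.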

avulisha (Contributor Author) commented

Okay, I will implement your suggestion.

avulisha (Contributor Author) commented

Hi @masahi,
Thanks for your time in reviewing the changes. I have pushed the changes that you have suggested.
Best Regards,
Anurag

masahi (Member) commented Apr 12, 2022

Hi @avulisha (cc @mei-ye), I want to try this. What is your typical workflow? For example, I want to capture the trace from running https://github.com/apache/tvm/blob/main/apps/topi_recipe/gemm/cuda_gemm_square.py.

It looks like I need to press the "Capture profile" button in the profiler UI, but the script finishes before I am able to start capturing. So I'm wondering how you typically work around that issue. I do see tvm/src/runtime/vulkan/vulkan_instance.cc:65: Push VK_EXT_debug_utils logged.

mei-ye (Contributor) commented Apr 13, 2022

Successfully capturing a trace requires at least five complete Present events. Since the inference time is very short (tens of milliseconds), a loop with many iterations is normally required so that the capture completes before the process terminates.
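
The looping workaround can be sketched generically. `RunForCapture` is a hypothetical helper, the workload is an arbitrary callback standing in for one inference run, and the iteration and duration thresholds are illustrative values chosen by the caller rather than numbers RGP documents.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: runs `workload` at least `min_iterations` times and
// keeps going until `min_duration` has elapsed, so a profiler has time to
// start and finish a capture. Returns the number of iterations executed.
inline int RunForCapture(const std::function<void()>& workload,
                         int min_iterations,
                         std::chrono::milliseconds min_duration) {
  const auto start = std::chrono::steady_clock::now();
  int iterations = 0;
  while (iterations < min_iterations ||
         std::chrono::steady_clock::now() - start < min_duration) {
    workload();
    ++iterations;
  }
  return iterations;
}
```

With a minimum of five iterations and a duration of several seconds, the process stays alive long enough for someone to press the capture button in the profiler UI.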

avulisha (Contributor Author) commented

Hi @masahi,
As Mei mentioned, the run is too short for the RGP tool to capture traces. For testing, we can use the frontend tests: "TVM_FFI=ctypes python -m pytest -v tests/python/frontend/onnx/test_forward.py"
Many tests run as part of the frontend tests, which gives the RGP tool enough time to capture traces.
Thanks,
Anurag

masahi (Member) left a comment

I'll try capturing later.

@masahi masahi merged commit b542724 into apache:main Apr 13, 2022
masahi (Member) commented Apr 13, 2022

Thanks! @avulisha

avulisha (Contributor Author) commented

Hi @masahi.
Thanks for your time in reviewing the changes and merging them.
Best Regards,
Anurag

AndrewZhaoLuo added a commit to AndrewZhaoLuo/tvm that referenced this pull request Apr 15, 2022
* main: (527 commits)
  [hexagon] 'add_hvx' test to explore HVX usage. (apache#10604)
  [COMMUNITY] @yzh119 -> Reviewer (apache#10993)
  [Metaschedule] Make custom schedule_rule registration optional (apache#10975)
  [ONNX] Add imports for BERT contrib operators (apache#10949)
  sort axes (apache#10985)
  [Hexagon] Remove HexagonBuffer external constructor and support (apache#10978)
  [CI] Update GPU image (apache#10992)
  [Runtime][Vulkan] Add RGP support to TVM for vulkan device (apache#10953)
  [FIX] resolve int64/32 for AttrStmtNode (apache#10983)
  [TVMC] Allow output module name to be passed as a command line argument (apache#10962)
  [ONNX] Add MatMulInteger importer (apache#10450)
  [COMMUNITY] @guberti -> Reviewer (apache#10976)
  Support `qnn.conv2d` in FoldExplicitPading (apache#10982)
  change Hexagon docker version (apache#10981)
  remove exception handling of autotvm xgboost extract functions (apache#10948)
  [CUDNN] Add partitioning support for conv2d and log_softmax (apache#10961)
  [Hexagon][LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout (apache#10905)
  [Hexagon] Move aot/graph_executor interactions into launcher (apache#10907)
  [HEXAGON] Split huge 1D DMA Transfers into smaller transfers with legal sizes. (apache#10971)
  [CI][DOCKER] Add pytest-lazy-fixture to images (apache#10970)
  ...
Lucien0 pushed a commit to Lucien0/tvm that referenced this pull request Apr 19, 2022
Co-authored-by: avulisha <avulisha@amd.com>
altanh pushed a commit to altanh/tvm that referenced this pull request Apr 28, 2022